A large database DNA sequence handling program with generalized searching specifications

Author

  • Peter A. Stockwell
Abstract

The program described allows for the creation and manipulation of files of DNA sequence data up to very great lengths. It uses its own paging system to load segments of the sequence into a small internal buffer, so that it does not have excessive memory requirements. The program offers a menu of functions to the user, and has been written to be forgiving of user errors. A code for the generalised specification of bases as a series of groups (e.g. A or T, Purine, etc.) has been devised and can be used in search specifications or in sequence files. Versions of the program have been developed to run with special efficiency under DIGITAL's RT11 operating system or to run under systems with a suitable implementation of FORTRAN IV.
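The two ideas in the abstract, a small paged buffer over a long sequence file and a search pattern written in base groups, can be illustrated with a minimal Python sketch. This is not the author's FORTRAN implementation. The names `PagedSequence` and `find_all` are hypothetical, the standard IUPAC ambiguity codes stand in for the paper's own group code (which the abstract does not spell out), and the sequence is assumed to be stored as one ASCII byte per base using only A, C, G and T.

```python
import io

# Assumption: IUPAC ambiguity codes stand in for the paper's own group code.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "W": "AT", "S": "CG", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


class PagedSequence:
    """Read-only view of a sequence that keeps a single small page in memory."""

    def __init__(self, stream, page_size=512):
        self.stream = stream          # binary, seekable, one ASCII byte per base
        self.page_size = page_size
        self.page_start = -1          # offset of the page currently held
        self.page = b""
        self.loads = 0                # number of page loads, for inspection
        stream.seek(0, io.SEEK_END)
        self.length = stream.tell()

    def __len__(self):
        return self.length

    def base(self, i):
        """Return the base at 0-based position i, loading its page if needed."""
        if not 0 <= i < self.length:
            raise IndexError(i)
        start = (i // self.page_size) * self.page_size
        if start != self.page_start:
            self.stream.seek(start)
            self.page = self.stream.read(self.page_size)
            self.page_start = start
            self.loads += 1
        return chr(self.page[i - start])


def find_all(seq, pattern):
    """Return 0-based start positions where each pattern symbol's group
    contains the corresponding sequence base (naive left-to-right scan)."""
    groups = [IUPAC[ch] for ch in pattern.upper()]
    hits = []
    for start in range(len(seq) - len(groups) + 1):
        if all(seq.base(start + k) in g for k, g in enumerate(groups)):
            hits.append(start)
    return hits


if __name__ == "__main__":
    seq = PagedSequence(io.BytesIO(b"GATTACAGAATTC"), page_size=4)
    print(find_all(seq, "GAWT"))  # W means A or T
```

With a single page held in memory, a pattern that straddles a page boundary forces repeated reloads during the scan. A real implementation would likely keep more than one page or overlap adjacent pages; the sketch leaves that out for brevity.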

Similar resources

Processing and population genetic analysis of multigenic datasets with ProSeq3 software

MOTIVATION The current tendency in molecular population genetics is to use increasing numbers of genes in the analysis. Here I describe a program for handling and population genetic analysis of DNA polymorphism data collected from multiple genes. The program includes a sequence/alignment editor and an internal relational database that simplify the preparation and manipulation of multigenic DNA ...

SubtiList: the reference database for the Bacillus subtilis genome

SubtiList is the reference database dedicated to the genome of Bacillus subtilis 168, the paradigm of Gram-positive endospore-forming bacteria. Developed in the framework of the B.subtilis genome project, SubtiList provides a curated dataset of DNA and protein sequences, combined with the relevant annotations and functional assignments. Information about gene functions and products is continuou...

Indexing Strategies for Rapid Searches of Short Words in Genome Sequences

Searching for matches between large collections of short (14-30 nucleotides) words and sequence databases comprising full genomes or transcriptomes is a common task in biological sequence analysis. We investigated the performance of simple indexing strategies for handling such tasks and developed two programs, fetchGWI and tagger, that index either the database or the query set. Either strategy...

SubtiList: a relational database for the Bacillus subtilis genome.

In the framework of the international collaborative project aiming to sequence the whole Bacillus subtilis chromosome, we have created a relational database for managing and analysing information associated with the molecular genetics of this bacterium: SubtiList. It allows recovery of non-redundant DNA sequences of the B. subtilis genome, as well as related information, i.e. genes, proteins, e...

Mining patterns and rules for software specification discovery

Software specifications are often lacking, incomplete and outdated in the industry. Lack and incomplete specifications cause various software engineering problems. Studies have shown that program comprehension takes up to 45% of software development costs. One of the root causes of the high cost is the lack-of documented specification. Also, outdated and incomplete specification might potential...

Journal:
  • Nucleic acids research

Volume 10, Issue 1

Pages: -

Publication date: 1982